This paper is concerned with the identification of a fairly general class
of nonlinear operators using corrupted measurements. A precise mathematical
definition of identification is presented and the relationship between
a priori information and identification is studied. The a priori information
is represented as a subset of a metric space of nonlinear operators. Necessary
and sufficient conditions are developed to answer the question "When is
identification possible?"

I. INTRODUCTION 

A large body of literature already exists for the problem of identifying 
a control system or communication channel with noisy measurements. 
In the usual identification problems, a certain structure is assumed at 
the outset in order to reduce the identification problem to one of param- 
eter estimation. The absence of such parametrization increases the 
difficulty of the problem substantially. It is often not clear if identifica- 
tion is even possible. 

In this paper we are concerned with the determinability (identifiability)
of quite general nonlinear operators whose outputs are corrupted
by additive gaussian noise. We introduce a norm on this space of nonlinear
operators and define precisely what we mean by determinability.
Loosely speaking, we say that we can determine an operator $H$ if we
can choose a finite observation interval $[0, T]$, a test signal with
constrained peak value over this interval, a finite set of linear measurements
over $[0, T]$, and an estimate $\hat{H}$ of $H$ which is a continuous function
of our measurements such that $\hat{H}$ is close to $H$ in norm with high
probability.

The question of determinability is of course intimately related to the
kind of a priori knowledge one has of the operator. We represent this
a priori information by saying that the operator $H$ belongs to a subset
$\mathcal{D}$ of possible operators. We derive conditions on $\mathcal{D}$ which
are sufficient for determinability. We also show that most of these conditions
are in fact necessary for the determination of $H$.

Our results are motivated by the work on the determinability of
noiseless channels done by Root, Prosser, and Varaiya (Refs. 1-4). They derive
necessary and sufficient conditions to estimate a noiseless channel
closely with a "one-shot" experiment. These conditions are similar to
those presented here. Some work on the noisy problem has been done by
Root (Ref. 5). His approach and results are fundamentally different from those
presented in this paper. Root investigated a class of stochastic nonlinear
operators represented by a Volterra series whose kernels are gaussian
random variables. He derived necessary and sufficient conditions for
the second moments of the kernels to be determinable.

II. PRELIMINARIES 

The types of channels to be considered can be described as follows.
The input signal $x$ and observed signal $w$ are related via the operator
equation

$$ w(t) = [Hx](t) + z(t), \qquad t \in [0, \infty) \tag{1} $$

where $H$ is an operator and $z$ is zero mean white gaussian noise† with
covariance $E\,z(t)z(\tau) = \delta(t - \tau)$. (The colored noise case will be
treated separately in Section VI.)

We constrain our input functions $x$ to have peak value less than $s$,
that is,

$$ x \in L_\infty(s) = \{x \mid x \text{ is a real valued measurable function on } [0, \infty) \text{ and } |x(t)| \le s \text{ for all } t \in [0, \infty)\}. $$

† The noise term $z(t)$ in equation (1) must be interpreted symbolically since white
noise cannot be parametrized with a time variable, but must properly be
parametrized with an element of a space of "testing functions." However, we deal
only with functionals of $w(t)$ of the form

$$ \int_0^b w(t)\phi(t)\,dt, $$

where $\phi \in L_2(0, b)$, or with quantities derivable from these functionals.
Hence we can define

$$ \int_0^b z(t)\phi(t)\,dt \quad \text{to mean} \quad \int_0^b \phi(t)\,d\zeta(t) $$

where $\zeta(t)$ is Brownian motion and the operations to be performed are readily
justified.

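If we let $\|\cdot\|_2$ denote the norm on $L_2[0, \infty)$ and define the projection
operator $P_T$ by

$$ [P_T x](t) = \begin{cases} x(t) & \text{for } t \le T \\ 0 & \text{for } t > T \end{cases} $$

then

$$ \|P_T x\|_2 = \left(\int_0^T x^2(t)\,dt\right)^{1/2} \le s\,T^{1/2} \quad \text{for all } x \in L_\infty(s). $$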
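
The truncation bound above can be checked numerically. The following is a minimal sketch, assuming a uniform time grid and a Riemann-sum approximation of the $L_2$ norm; the helper names `project` and `l2_norm` and the sinusoidal test input are illustrative choices, not part of the paper.

```python
import math

def project(x, dt, T):
    """Truncation P_T on the grid t_k = k*dt: zero every sample with t_k > T."""
    return [v if k * dt <= T else 0.0 for k, v in enumerate(x)]

def l2_norm(x, dt):
    """Riemann-sum approximation of the L2 norm of a sampled function."""
    return math.sqrt(sum(v * v for v in x) * dt)

s, T, dt = 2.0, 3.0, 0.001
# Hypothetical input with peak value s on [0, 10).
x = [s * math.sin(5.0 * k * dt) for k in range(10000)]

norm_PTx = l2_norm(project(x, dt, T), dt)
bound = s * math.sqrt(T)   # the bound s * T^(1/2)
```

Any input whose samples stay within $[-s, s]$ should give `norm_PTx <= bound`, up to the small error of the Riemann sum at the endpoint of the grid.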

The types of operators which we consider are assumed to belong to
the space $\mathcal{H}$. The space $\mathcal{H}$ is defined: if $H \in \mathcal{H}$ then

(i) $H : L_\infty(s) \to L_{2e}$,

where $L_{2e} = \{y \mid y$ is a real valued, measurable function on
$[0, \infty)$, $\|P_T y\|_2 < \infty$ for all $T > 0\}$,

(ii) $H$ is causal; that is, for all $T > 0$, $x \in L_\infty(s)$, $P_T Hx = P_T H P_T x$,

(iii) $\|H\| < \infty$.

Using the usual definitions of addition of operators and multiplication
by scalars, the norm of $H$, $\|H\|$, is defined as:

$$ \|H\| = \sup_{\substack{T > 0,\; x \in L_\infty(s) \\ \|P_T x\|_2 \ne 0}} \frac{\|P_T Hx\|_2}{\|P_T x\|_2}. $$

We consider $H$ to be the zero operator† if $\|H\| = 0$. It is then easy
to show that $\|\cdot\|$ satisfies the norm axioms. Obviously $\|H\| \ge 0$ for
all $H \in \mathcal{H}$ and $\|\lambda H\| = |\lambda|\,\|H\|$ for all scalars $\lambda$.
The triangle inequality is also satisfied since

$$ \|H + K\| = \sup \frac{\|P_T(Hx + Kx)\|_2}{\|P_T x\|_2} = \sup \frac{\|P_T Hx + P_T Kx\|_2}{\|P_T x\|_2} \le \sup \frac{\|P_T Hx\|_2 + \|P_T Kx\|_2}{\|P_T x\|_2} $$

$$ \le \sup \frac{\|P_T Hx\|_2}{\|P_T x\|_2} + \sup \frac{\|P_T Kx\|_2}{\|P_T x\|_2} = \|H\| + \|K\|, $$

where the suprema are taken over all $T > 0$, $x \in L_\infty(s)$, $\|P_T x\|_2 \ne 0$.

† The equivalence classes defined in this manner are not unreasonable. In fact,
if $\|H\| = 0$ then $\|P_T Hx\|_2 = 0$ for all $x \in L_\infty(s)$, $\|P_T x\|_2 \ne 0$
and all $T > 0$. As far as we are concerned this is the zero operator since $Hx$ is
then the zero function in the $L_2(0, \infty)$ sense.

If we consider the metric induced by the norm $\|\cdot\|$ then $\mathcal{H}$ is a
complete metric space. The proof of this proposition is contained in the
appendix. The completeness property is crucial to Theorem 2 of this
paper.

The space $\mathcal{H}$ includes many types of operators familiar to those in
communication and control theory. Linear time invariant convolution
operators whose kernels are either in $L_1(0, \infty)$ or $L_2(0, \infty)$ are in
$\mathcal{H}$. If these operators are cascaded with a memoryless nonlinearity having
bounded slope, the composite operators are also in $\mathcal{H}$. Operators
described by certain nonlinear dynamical systems are also in $\mathcal{H}$. Let
$x \in L_\infty(s)$ be the input to the following dynamical system and let $y$ be the
output:

$$ \dot{q}(t) = f(q(t), x(t), t), \qquad q(0) = 0, \qquad f : R^n \times R \times R \to R^n $$

$$ y(t) = g(q(t)), \qquad g : R^n \to R $$

with

$$ |g(q)| \le K_1 |q|, \qquad |f(q, x, t)| \le K_2 |q| + K_3 |x| $$

for all $q \in R^n$, $|x| \le s$, $t > 0$. Assume also that for each $x \in L_\infty(s)$
there exists a solution to the differential equation. Then, via the
Bellman-Gronwall inequality we see that

$$ |q(t)| \le K_3 \int_0^t e^{K_2(t - \tau)} |x(\tau)|\,d\tau. $$

Hence,

$$ \left(\int_0^T |q(t)|^2\,dt\right)^{1/2} \le K_2 K_3 \left(\int_0^T x^2(t)\,dt\right)^{1/2} $$

and

$$ \left(\int_0^T |y(t)|^2\,dt\right)^{1/2} \le K_1 K_2 K_3\, \|P_T x\|_2. $$

Thus, the operator described is in $\mathcal{H}$ with norm bounded by $K_1 K_2 K_3$.
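
As a numerical illustration of the first example, the ratio that defines the operator norm can be probed from below for a convolution operator with an $L_1$ kernel. This is a minimal sketch under assumptions that are not in the paper: a uniform grid, the kernel $e^{-t}$, and two hand-picked inputs; the names `causal_convolve`, `l2_norm` and `gains` are hypothetical. The computed ratios should not exceed the discretized $L_1$ norm of the kernel.

```python
import math

def causal_convolve(h, x, dt):
    """Discretized [Hx](t) = integral_0^t h(t - tau) x(tau) d tau."""
    return [dt * sum(h[k - j] * x[j] for j in range(k + 1))
            for k in range(len(x))]

def l2_norm(v, dt):
    """Riemann-sum approximation of the L2 norm."""
    return math.sqrt(sum(a * a for a in v) * dt)

dt, n, s = 0.01, 400, 1.0
h = [math.exp(-k * dt) for k in range(n)]      # assumed kernel e^{-t}
h_l1 = dt * sum(abs(a) for a in h)             # discretized L1 norm of h

inputs = {
    "constant": [s] * n,
    "square": [s if (k // 50) % 2 == 0 else -s for k in range(n)],
}

# For a causal operator, P_T H x depends only on P_T x, so truncating both
# sequences at sample m gives the ratio ||P_T Hx|| / ||P_T x|| for T = m*dt.
gains = {}
for name, x in inputs.items():
    y = causal_convolve(h, x, dt)
    gains[name] = max(l2_norm(y[:m], dt) / l2_norm(x[:m], dt)
                      for m in range(1, n + 1))
```

Each entry of `gains` is only a lower estimate of the supremum in the norm definition, since it scans two inputs rather than all of $L_\infty(s)$.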
Subsets of $\mathcal{H}$ will be used to represent the a priori information in an
identification problem. We call a subset $\mathcal{D}$ of $\mathcal{H}$ determinable if
every member of $\mathcal{D}$ can be identified. The determinability of a subset
$\mathcal{D}$ depends of course on our definition of identification. We would like to
consider only those identification procedures which could theoretically
be implemented in real time. The identification procedures which we are
concerned with must have the following properties. To identify $H$ we
must be able to

(i) choose a finite observation interval,
(ii) select an input function with constrained peak value,
(iii) perform linear measurements on the noisy observations generated
by this input, and
(iv) operate on these measurements to yield an estimate of $H$ which is
a continuous function of these measurements,

so that our estimate of $H$ is close to $H$ with high probability.

The properties of such an identification procedure are physically 
very appealing. We obviously must be able to identify within a finite 
period of time. The peak value restriction is the usual kind of input 
constraint used in communication theory. Linear measurements are 
easily implemented and tend to reduce the sensitivity to unknown 
biases as does the continuity requirement on the estimate. Finally, 
we are usually satisfied to identify to within a small tolerance. 

For $H \in \mathcal{H}$ and channel model given by equation (1) we may specify
our definition of identification even further. A linear measurement over
the time interval $[0, T]$ is a finite collection of bounded linear
functionals† $(p_i, w)$, $i = 1, 2, \ldots, N$, $p_i \in L_2[0, T]$, defined when
$P_T w \in L_2[0, T]$ and

$$ w(t) = [Hx](t) + z(t), \qquad 0 \le t \le T $$

is the received waveform with $H \in \mathcal{H}$, $x \in L_\infty(s)$. We say that a class
$\mathcal{D} \subset \mathcal{H}$ of channel operators is determinable if given arbitrary
positive constants $\epsilon$ and $\eta$, there exists a finite observation interval
$[0, T]$, an input (test) signal $x \in L_\infty(s)$, a linear measurement
$\{(p_1, w), (p_2, w), \ldots, (p_N, w)\}$ over $[0, T]$, and a continuous function
$g : R^N \to \mathcal{H}$ such that for each $H \in \mathcal{D}$,

$$ \text{Probability}\,(\|H - \hat{H}\| > \epsilon) < \eta, $$

where $\hat{H} = g[(p_1, w), (p_2, w), \ldots, (p_N, w)]$. Thus, if $\mathcal{D}$ is
determinable, we can "identify" any element of $\mathcal{D}$ to within any specified
accuracy with sufficient processing and long enough observation time.

† The symbol $(f, h)$ is used to represent the inner product in $L_2[0, T]$; that is,
$(f, h) = \int_0^T f(t)h(t)\,dt$.
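
A single linear measurement $(p, w)$ on the noisy observation can be simulated as follows. This is a minimal sketch, assuming the white noise term is handled through Brownian increments of variance `dt` on a uniform grid (in the spirit of the footnote to equation (1)); the function name `linear_measurement`, the noiseless output `y` and the weighting function `p` are hypothetical choices for illustration.

```python
import math
import random

def linear_measurement(p, y, dt, rng):
    """One sample of (p, w) = (p, y) + integral of p d(zeta), where y = Hx."""
    signal = sum(pk * yk for pk, yk in zip(p, y)) * dt
    # Brownian increments over a step dt are N(0, dt) for unit white noise.
    noise = sum(pk * rng.gauss(0.0, math.sqrt(dt)) for pk in p)
    return signal + noise

rng = random.Random(0)
dt, n = 0.01, 200                               # observation interval [0, 2]
y = [1.0] * n                                   # hypothetical noiseless output Hx
p = [math.cos(k * dt) for k in range(n)]        # hypothetical element of L2[0, T]

samples = [linear_measurement(p, y, dt, rng) for _ in range(2000)]
mean = sum(samples) / len(samples)
var = sum((v - mean) ** 2 for v in samples) / (len(samples) - 1)

expected_mean = sum(pk * yk for pk, yk in zip(p, y)) * dt   # (p, Hx)
expected_var = sum(pk * pk for pk in p) * dt                # ||p||^2
```

Under these assumptions each measurement is gaussian with mean $(p, Hx)$ and variance $\|p\|_2^2$, which is the property the proof of Theorem 1 relies on.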

The bulk of this paper is related to answering the following question.
What structure must $\mathcal{D}$ have in order to be determinable? Theorem 1
derives sufficient conditions on $\mathcal{D}$ in order to be determinable. The key
condition is compactness. Theorem 2 indicates that this condition is in
fact necessary for determinability. A number of corollaries are given
which interpret these results for the case where $\mathcal{D}$ is composed of linear
convolution operators.

III. SUFFICIENT DETERMINABILITY CONDITIONS

Despite the generality of our class of operators and the rather rigid
nature of allowable identification schemes only two conditions guarantee
the determinability of a subset of operators. Both conditions are somewhat
obvious. One condition insures that the class may be approximated
closely by a finite number of elements; the other insures that a test
signal exists which will produce sufficiently dissimilar responses for
dissimilar channels. These conditions are rigorously stated in Theorem 1.

Theorem 1: Let $\mathcal{D}$ be a subset of $\mathcal{H}$ having the following properties:

(i) the closure of $\mathcal{D}$ is compact (thus $\mathcal{D}$ is also bounded; that is,
there exists a constant $R > 0$, such that $\|H - K\| < R$ for all $H, K \in \mathcal{D}$)

(ii) given any $\delta > 0$ there exists an unbounded sequence $\{T_i\}$, a sequence
of inputs $x_i \in L_\infty(s)$ and a positive number $r$ such that

$$ \|P_{T_i}(Hx_i - Kx_i)\|_2^2 > r\,T_i $$

for all pairs $H, K \in \mathcal{D}$ for which $\|H - K\| \ge \delta$. Then $\mathcal{D}$ is a
determinable subset of $\mathcal{H}$.

Proof: Since the proof of this theorem is lengthy, we give here a brief,
rough description of the key steps involved which the reader may use
as a guide through the mathematical details.

(i) Using (ii) of Theorem 1 we select an input $x_i$ to give sufficient
separation of outputs over $[0, T_i]$ for sufficiently dissimilar channels.

(ii) We then approximate the class $\mathcal{D}$ to within a judiciously chosen
accuracy by a finite number of elements.

(iii) The actual received signal due to the input selected in (i) of
this proof is correlated over $[0, T_i]$ with the calculated outputs of the
channels selected in step (ii) of this proof.

(iv) If one of these correlations is larger than the others by some
amount we select as our estimate the corresponding element of the
approximating class that yielded this correlation. If there is no such
correlation we assign an arbitrary rule so as to make the identification
procedure a continuous function of the correlated values.

(v) We finally show that as $i$ (and hence $T_i$) increases, the probability
that there will not be a correlation larger than the others by some
prescribed amount goes to zero. In addition, we show that the probability
that our identification procedure yields an estimate which is
further apart in norm from the actual channel than is desired is vanishingly
small as $i$ increases.

The formal statement of the proof follows below.

We may assume that $\mathcal{D}$ is closed, since subsets of a determinable set
of channels are determinable. Using assumption (ii) of Theorem 1
with $\delta = 3\epsilon/4$ we have that there exists an unbounded sequence $\{T_i\}$,
a positive number $r$, and for each $i$ an input signal $x_i \in L_\infty(s)$ such that
for all pairs $H, K \in \mathcal{D}$ with $\|H - K\| > 3\epsilon/4$

$$ \|P_{T_i}(Hx_i - Kx_i)\|_2^2 \ge r\,T_i. \tag{2} $$

In what follows we will denote the operator which we wish to identify
by $H$. Since $\mathcal{D}$ is closed, by assumption (i) of Theorem 1, it is also
compact and hence totally bounded (see for example Ref. 6, p. 22). Therefore,
given any $T_i \in \{T_i\}$ we can choose a finite number of balls of radius
$r_0 = \min\{r^{1/2}/2s,\ \epsilon/4\}$ with centers $H_\alpha \in \mathcal{D}$,
$\alpha = 1, 2, \ldots, M$ to cover $\mathcal{D}$. There may be operators
$H_l, H_k \in \{H_\alpha\}$ for which $\|P_{T_i}(H_l x_i - H_k x_i)\|_2 = 0$, in which
case retain only the $H_\alpha$ with the lowest subscript. Thus we have a subset of
$\{H_\alpha\}$ which we label $\{H_\beta\}$ for which
$\|P_{T_i}(H_l x_i - H_k x_i)\|_2 > \theta_i > 0$ for some $\theta_i$, and all distinct
$H_l, H_k \in \{H_\beta\}$. For convenience order the $\{H_\beta\}$ so that
$\|H - H_\beta\| \le 3\epsilon/4$ for $\beta = 1, 2, \ldots, N_0 - 1$ and
$\|H - H_\beta\| > 3\epsilon/4$ for $\beta = N_0, N_0 + 1, \ldots, N$, $N \le M$.

We can now choose an appropriate linear measurement over the
interval $[0, T_i]$. We define the linear measurement
$m(w) = \{f(w, 1), f(w, 2), \ldots, f(w, N)\}$:
$f(w, \beta) = (w, 2H_\beta x_i)$, $\beta = 1, 2, \ldots, N$, where the
inner product is defined over the interval $[0, T_i]$. Thus for each received
waveform $w(t)$, the linear measurement gives us a point in $R^N$. From
this measurement we will determine an estimator function $g : R^N \to \mathcal{H}$.
We first partition $R^N$ into $N + 1$ disjoint subsets: $A_1, A_2, \ldots, A_N, B$,
with

$$ A_j = \{a = (a_1, a_2, \ldots, a_N) : a_j - a_k > (H_j x_i, H_j x_i) - (H_k x_i, H_k x_i) + \theta_i/T_i,\ k = 1, 2, \ldots, N,\ k \ne j\} $$

and $B$ the remainder of $R^N$,

$$ B = R^N - \bigcup_{j=1}^{N} A_j. $$

The disjointness of the above subsets of $R^N$ is easily verified by making
use of the fact that $\theta_i/T_i > 0$. The estimator function is defined in
terms of this partition:

$$ g(m) = H_j \quad \text{if } m(w) \in A_j $$

$$ g(m) = \sum_{j=1}^{N} \alpha_j(w) H_j \quad \text{if } m(w) \in B $$

where†

$$ \alpha_j(w) = \frac{\prod_{l \ne j} d(m(w), A_l)}{d(m(w), A_j) + \prod_{l \ne j} d(m(w), A_l)} $$

and

$$ d(x, A) = \inf_{y \in A} |x - y|. $$

It is not difficult to show that $g$ is a continuous mapping from $R^N$ into
$\mathcal{H}$. Having given the identification scheme we now show that for any
$H \in \mathcal{D}$, $\epsilon > 0$

$$ P\{\|H - \hat{H}\| > \epsilon\} \to 0 \quad \text{as } T_i \to \infty. $$
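
The decision part of this estimator can be sketched in code. This is a simplified illustration, assuming sampled candidate outputs $H_\beta x_i$ on a uniform grid and a noiseless observation; it implements only membership in the sets $A_j$ (through the quantities $f(w, \beta) - (H_\beta x_i, H_\beta x_i)$) and returns `None` on the remainder set $B$ rather than the continuous interpolation. The names `decide`, `inner`, `outputs` and `margin` are hypothetical.

```python
def inner(u, v, dt):
    """Riemann-sum approximation of the inner product on [0, T]."""
    return sum(a * b for a, b in zip(u, v)) * dt

def decide(f, energies, margin):
    """Return j if the measurement lies in A_j, else None (remainder set B).

    f[b]        : measurement (w, 2 H_b x)
    energies[b] : (H_b x, H_b x)
    margin      : the separation theta_i / T_i
    """
    q = [fb - eb for fb, eb in zip(f, energies)]
    for j, qj in enumerate(q):
        if all(qj - qk > margin for k, qk in enumerate(q) if k != j):
            return j
    return None

dt = 0.01
# Hypothetical candidate outputs H_b x on [0, 1]: constants 0, 1 and 2.
outputs = [[c] * 100 for c in (0.0, 1.0, 2.0)]
w = outputs[1]                                   # noiseless observation
f = [inner(w, [2.0 * v for v in y], dt) for y in outputs]
energies = [inner(y, y, dt) for y in outputs]

choice = decide(f, energies, margin=0.1)
tie = decide([0.0, 0.0], [0.0, 0.0], margin=0.1)
```

Here the observation comes from the second candidate, so `choice` selects index 1, while the tied measurement falls in $B$ and yields `None`.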

Recalling the definition of $B$, $A_j$, and the labeling convention we have
used, we see that

$$ P\{\|H - g(m(w))\| > \epsilon\} \le P\{m(w) \in B\} + P\Big\{m(w) \in \bigcup_{j=N_0}^{N} A_j\Big\} = P\Big\{m(w) \in \bigcap_{j=1}^{N} A_j^c\Big\} + P\Big\{m(w) \in \bigcup_{j=N_0}^{N} A_j\Big\}. \tag{3} $$

Let us first concentrate on obtaining bounds for the first term on the
right side of equation (3). We rewrite $A_j$ as $A_j = \big(\bigcup_{k \ne j} F_{jk}\big)^c$ where

$$ F_{jk} = \{a = (a_1, a_2, \ldots, a_N) : a_j - a_k \le (H_j x_i, H_j x_i) - (H_k x_i, H_k x_i) + \theta_i/T_i\}. $$

Thus

$$ \bigcap_{j=1}^{N} A_j^c = \bigcap_{j=1}^{N} \bigcup_{k \ne j} F_{jk}. \tag{4} $$

Applying DeMorgan's rules to equation (4), and after some thought,
we see that

$$ \bigcap_{j=1}^{N} A_j^c = \bigcup_{l=1}^{(N-1)^N} D_l \tag{5} $$

where $D_l$ has the form

$$ D_l = F_{1 l_1} \cap F_{2 l_2} \cap \cdots \cap F_{N l_N}, $$

with $l_j \ne j$ for all $j$. We can upper bound $P\{m(w) \in D_l\}$ by

$$ \sup_{k \ne j} P\Big\{-\frac{N\theta_i}{T_i} + (H_k x_i, H_k x_i) - (H_j x_i, H_j x_i) \le f(w, k) - f(w, j) \le \frac{N\theta_i}{T_i} + (H_k x_i, H_k x_i) - (H_j x_i, H_j x_i)\Big\}. \tag{6} $$

To see this, define $q(w, k) = f(w, k) - (H_k x_i, H_k x_i)$. Then $P\{m(w) \in D_l\}$
is the probability of the $N$ events $q(w, 1) - q(w, l_1) \le \theta_i/T_i$,
$q(w, 2) - q(w, l_2) \le \theta_i/T_i$, $\ldots$, $q(w, N) - q(w, l_N) \le \theta_i/T_i$
occurring simultaneously. Suppose $l_1 = k$. Then consider the two events
$q(w, 1) - q(w, l_1) = q(w, 1) - q(w, k) \le \theta_i/T_i$ and
$q(w, k) - q(w, l_k) \le \theta_i/T_i$. If $l_k = 1$, then these two events are
contained in the event $-\theta_i/T_i \le q(w, 1) - q(w, k) \le \theta_i/T_i$.
If $l_k = j \ne 1$ then consider the three events

$$ q(w, 1) - q(w, k) \le \theta_i/T_i, $$

$$ q(w, k) - q(w, j) \le \theta_i/T_i, $$

$$ q(w, j) - q(w, l_j) \le \theta_i/T_i. $$

If $l_j = 1$, then these three simultaneous events are contained in the
event $-\theta_i/T_i \le q(w, 1) - q(w, j) \le 2\theta_i/T_i$. If $l_j = k$, then these
three simultaneous events are contained in the event
$-\theta_i/T_i \le q(w, k) - q(w, j) \le \theta_i/T_i$. Continuing in this fashion we
obtain the bound in equation (6).

Since $q(w, k) - q(w, j)$ is gaussian, we can bound the value of the
expression in equation (6) quite easily.

† It turns out that the form of $\alpha_j(w)$ is irrelevant since we show that
$P\{m(w) \in B\}$ vanishes as $T_i$ increases. It is merely included to make the
estimator function continuous.

Let

$$ a_{kj} = E[q(w, k) - q(w, j)] = \|P_{T_i}(Hx_i - H_j x_i)\|_2^2 - \|P_{T_i}(Hx_i - H_k x_i)\|_2^2 \tag{7} $$

and

$$ \sigma_{kj}^2 = \text{Var}\,[q(w, k) - q(w, j)] = 4\,\|P_{T_i}(H_k x_i - H_j x_i)\|_2^2 > 4\theta_i^2. \tag{8} $$

Hence,

$$ P\Big\{-\frac{N\theta_i}{T_i} \le q(w, k) - q(w, j) \le \frac{N\theta_i}{T_i}\Big\} = (2\pi)^{-1/2} \int_{(-N\theta_i/T_i - a_{kj})/\sigma_{kj}}^{(N\theta_i/T_i - a_{kj})/\sigma_{kj}} \exp\Big(-\frac{z^2}{2}\Big)\,dz $$

$$ \le (2\pi)^{-1/2} \int_{-N\theta_i/T_i\sigma_{kj}}^{N\theta_i/T_i\sigma_{kj}} \exp\Big(-\frac{z^2}{2}\Big)\,dz \le (2\pi)^{-1/2} \int_{-N/2T_i}^{N/2T_i} \exp\Big(-\frac{z^2}{2}\Big)\,dz \le (2\pi)^{-1/2} \int_{-M/2T_i}^{M/2T_i} \exp\Big(-\frac{z^2}{2}\Big)\,dz \tag{9} $$

(recall that $N \le M$).
Using equations (9) and (5) we see that

$$ P\Big\{m(w) \in \bigcap_{j=1}^{N} A_j^c\Big\} \le (M - 1)^M (2\pi)^{-1/2} \int_{-M/2T_i}^{M/2T_i} \exp\Big(-\frac{z^2}{2}\Big)\,dz. \tag{10} $$

Since the right side of equation (10) goes to zero as $T_i$ increases we can
choose a $T_i \in \{T_i\}$ large enough so that this term is less than $\eta/2$. We now
bound the second term on the right side of equation (3):



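$$ P\Big\{m(w) \in \bigcup_{j=N_0}^{N} A_j\Big\} = \sum_{j=N_0}^{N} P\{m(w) \in A_j\}. \tag{11} $$

Recall that

$$ A_j = \bigcap_{k \ne j} F_{jk}^c. $$

Hence

$$ P\Big\{m(w) \in \bigcup_{j=N_0}^{N} A_j\Big\} = \sum_{j=N_0}^{N} P\Big\{m(w) \in \bigcap_{k \ne j} F_{jk}^c\Big\}. \tag{12} $$

Observe that for all $k \ne j$

$$ P\Big\{m(w) \in \bigcap_{k \ne j} F_{jk}^c\Big\} \le P\{m(w) \in F_{jk}^c\} \tag{13} $$

$$ = P\Big\{q(w, j) - q(w, k) > \frac{\theta_i}{T_i}\Big\} \tag{14} $$

$$ = (2\pi)^{-1/2} \int_{(\theta_i/T_i - a_{jk})/\sigma_{jk}}^{\infty} \exp\Big(-\frac{z^2}{2}\Big)\,dz. \tag{15} $$

Since $\mathcal{D}$ was covered by balls of radius $r_0$, there exists at least one
integer $\ell < N_0$ such that $\|H - H_\ell\| < r_0 \le \epsilon/4$ and hence
$\|P_{T_i}(Hx_i - H_\ell x_i)\|_2^2 < r_0^2 s^2 T_i$. Note also that since
$\|H_j - H\| > 3\epsilon/4$ for $j \ge N_0$,
$\|P_{T_i}(Hx_i - H_j x_i)\|_2^2 \ge r\,T_i$. Hence,

$$ -a_{j\ell} = \|P_{T_i}(Hx_i - H_j x_i)\|_2^2 - \|P_{T_i}(Hx_i - H_\ell x_i)\|_2^2 \ge r\,T_i - r_0^2 s^2 T_i \ge \Big(r - \frac{r}{4}\Big) T_i = \tfrac{3}{4}\, r\,T_i. \tag{16} $$

Recalling that $\mathcal{D}$ was bounded,

$$ \sigma_{j\ell}^2 = 4\,\|P_{T_i}(H_j x_i - H_\ell x_i)\|_2^2 \le 4R^2 s^2 T_i. \tag{17} $$

Using equations (16) and (17) in equation (15) we see that

$$ P\Big\{m(w) \in \bigcap_{k \ne j} F_{jk}^c\Big\} \le P\{m(w) \in F_{j\ell}^c\} \le \int_{3rT_i^{1/2}/16Rs}^{\infty} (2\pi)^{-1/2} \exp\Big(-\frac{z^2}{2}\Big)\,dz. \tag{18} $$

Hence from equation (11) we see that

$$ P\Big\{m(w) \in \bigcup_{j=N_0}^{N} A_j\Big\} \le M \int_{3rT_i^{1/2}/16Rs}^{\infty} (2\pi)^{-1/2} \exp\Big(-\frac{z^2}{2}\Big)\,dz. \tag{19} $$

Thus we can select a $T \in \{T_i\}$ so that this term is less than $\eta/2$. This
$T$ makes $P\{\|H - \hat{H}\| > \epsilon\} < \eta$ for all $H \in \mathcal{D}$.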
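
To see how quickly a bound of this type decays with the observation time, the gaussian tail can be evaluated with the complementary error function. This is a minimal sketch, assuming the lower limit of integration in (19) reads $3rT_i^{1/2}/16Rs$ (the printed limit is hard to recover from the source) and using arbitrary illustrative values for $M$, $r$, $R$ and $s$; the function names are hypothetical.

```python
import math

def gaussian_tail(a):
    """(2*pi)^(-1/2) * integral from a to infinity of exp(-z^2/2) dz."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))

def misclassification_bound(M, r, R, s, T):
    """M times the gaussian tail beyond 3 r T^(1/2) / (16 R s) (assumed form)."""
    return M * gaussian_tail(3.0 * r * math.sqrt(T) / (16.0 * R * s))

# Illustrative constants only; none of these values come from the paper.
bounds = [misclassification_bound(M=50, r=1.0, R=2.0, s=1.0, T=T)
          for T in (10.0, 100.0, 1000.0, 10000.0)]
```

For short observation intervals the bound exceeds one and is vacuous; it becomes useful only once $T^{1/2}$ dominates the constants, which is why the theorem asks for an unbounded sequence $\{T_i\}$.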

The identification technique proposed in the above proof is not 
necessarily a practical technique. Our intent is to indicate the possibility 
of identification rather than to derive easily implementable techniques. 
Notice, however, that since the measurements are linear functionals on 
L 2 (0, T) they are iterative in nature because of the integral representa- 
tion of such functionals. 

Theorem 1 gives sufficient conditions for determinability. Theorem 2 
indicates that some of these conditions are in fact necessary for identi- 
fication. 

IV. NECESSARY DETERMINABILITY CONDITIONS 

In this section we show that the approximability condition given by
condition (i) of Theorem 1 is in fact necessary. We also show that a type
of separation property is necessary, although it is not as strong as that
given by condition (ii) of Theorem 1.
Theorem 2: Let $\mathcal{D}$ be a bounded, determinable subset of $\mathcal{H}$. Then

(i) the closure of $\mathcal{D}$ is compact

(ii) given any $\delta > 0$ there exists an $\hat{x} \in L_\infty(s)$, $\hat{T} > 0$ and a
positive number $r(\delta)$ such that

$$ \|P_{\hat{T}}(H\hat{x} - K\hat{x})\|_2^2 > r(\delta) \quad \text{for all } H, K \in \mathcal{D} $$

satisfying $\|H - K\| \ge \delta$.

Proof: (i) Given $\epsilon > 0$, choose $\hat{T}$, $\hat{x} \in L_\infty(s)$, $N$ linear
measurements and an estimator $g(m(H, \omega))$ so that†

$$ P\{\|H - g(m(H, \omega))\| < \epsilon/2\} > \tfrac{3}{4} \quad \text{for all } H \in \mathcal{D}. $$

Since $\mathcal{D}$ is bounded and the measurements are linear, there exists a
compact ball $B_\epsilon \subset R^N$ so that

$$ P\{m(H, \omega) \in B_\epsilon\} > \tfrac{3}{4} \quad \text{for all } H \in \mathcal{D}. $$

Thus, since $g$ is continuous, $g(B_\epsilon)$ is compact. We can therefore cover
$g(B_\epsilon)$ by a finite number of balls of radius $\epsilon/2$. If
$g(B_\epsilon) \supset \mathcal{D}$ we could also cover $\mathcal{D}$ by the balls. We don't
have enough information to verify that $g(B_\epsilon) \supset \mathcal{D}$. Notice
however that

$$ P\{[\omega : \|H - g(m(H, \omega))\| < \epsilon/2] \cap [\omega : m(H, \omega) \in B_\epsilon]\} $$

$$ = P\{\omega : \|H - g(m(H, \omega))\| < \epsilon/2\} + P\{\omega : m(H, \omega) \in B_\epsilon\} - P\{[\omega : \|H - g(m(H, \omega))\| < \epsilon/2] \cup [\omega : m(H, \omega) \in B_\epsilon]\} $$

$$ \ge \tfrac{3}{4} + \tfrac{3}{4} - 1 = \tfrac{1}{2}. \tag{20} $$

We conclude that there exists an $\omega_0$ so that $m(H, \omega_0) \in B_\epsilon$ and
$\|H - g(m(H, \omega_0))\| < \epsilon/2$. We can repeat this argument for each
$H \in \mathcal{D}$. Therefore, $\mathcal{D}$ must lie within an $\epsilon/2$ neighborhood of
$g(B_\epsilon)$. By expanding the balls of radius $\epsilon/2$ which cover
$g(B_\epsilon)$ by a factor of two, the expanded balls will also cover $\mathcal{D}$.
Since this argument holds for any $\epsilon > 0$, $\mathcal{D}$ is shown to be totally
bounded. Since $\mathcal{H}$ is complete, the closure $\bar{\mathcal{D}}$ is
complete; and hence $\bar{\mathcal{D}}$ is compact (see Ref. 6, p. 22).

(ii) If $\mathcal{D}$ is determinable, then the closure of $\mathcal{D}$,
$\bar{\mathcal{D}}$, is also determinable. This is easily shown by noting that any
channel in $\bar{\mathcal{D}}$ can be approximated arbitrarily closely by a channel in
$\mathcal{D}$. Hence the measurements will be arbitrarily close and because of the
continuity of the estimate, the estimate will be close with high probability.

† Since the measurements are gaussian random variables we have included the
dependence on the sample points $\omega$ of the corresponding sample space $\Omega$.
Since $\bar{\mathcal{D}}$ is determinable, for every $\delta > 0$ there exists an
observation interval $[0, \hat{T}]$, a test signal $\hat{x} \in L_\infty(s)$, and an
estimator $g(m(\cdot, \omega))$ so that

$$ P\{\|H - g[m(H, \omega)]\| < \delta/2\} > \tfrac{3}{4} \quad \text{for all } H \in \bar{\mathcal{D}}. \tag{21} $$

Suppose that $\|P_{\hat{T}}(H\hat{x} - K\hat{x})\|_2 = 0$. Then, the measurements
obtained will be the same irrespective of whether $H$ or $K$ were used and
therefore the estimates for $K$ and $H$ will be identical. Since

$$ P\{\omega : \|H - g[m(H, \omega)]\| < \delta/2\} > \tfrac{3}{4} $$

and

$$ P\{\omega : \|K - g[m(H, \omega)]\| < \delta/2\} > \tfrac{3}{4}, $$

we see that

$$ P\{[\omega : \|K - g[m(H, \omega)]\| < \delta/2] \cap [\omega : \|H - g[m(H, \omega)]\| < \delta/2]\} $$

$$ = P\{\omega : \|K - g[m(H, \omega)]\| < \delta/2\} + P\{\omega : \|H - g[m(H, \omega)]\| < \delta/2\} - P\{[\omega : \|K - g[m(H, \omega)]\| < \delta/2] \cup [\omega : \|H - g[m(H, \omega)]\| < \delta/2]\} $$

$$ > \tfrac{3}{4} + \tfrac{3}{4} - 1 = \tfrac{1}{2}. $$

Thus there exists at least one sample point $\omega_0$ such that

$$ \|K - g(m(H, \omega_0))\| < \delta/2 $$

and

$$ \|H - g(m(H, \omega_0))\| < \delta/2 $$

which together imply that $\|H - K\| < \delta$. Hence, if $H, K \in \bar{\mathcal{D}}$ and
$\|H - K\| \ge \delta$ then $\|P_{\hat{T}}(H\hat{x} - K\hat{x})\|_2 > 0$.

Note that $\bar{\mathcal{D}} \times \bar{\mathcal{D}}$ is compact in the product topology
and hence $C(\delta) = \{(H, K) : \|H - K\| \ge \delta,\ H, K \in \bar{\mathcal{D}}\}$ is
also compact. The function $f(H, K) = \|P_{\hat{T}}(H\hat{x} - K\hat{x})\|_2^2$ is a
continuous map of $C(\delta)$ into the real line and hence it has a minimum value.
This minimum value cannot be zero because we have already shown that
$f(H, K) > 0$ for $(H, K) \in C(\delta)$. As a consequence, there exists a positive
number $r(\delta)$ such that

$$ \|P_{\hat{T}}(H\hat{x} - K\hat{x})\|_2^2 > r(\delta) \quad \text{for all } H, K \in \mathcal{D} $$

satisfying $\|H - K\| \ge \delta$.
V. LINEAR CONVOLUTION OPERATORS

When we specialize the results of Theorems 1 and 2 to linear convolution
operators, it is possible to obtain the characterization of the determinable
sets in terms of the kernels of these operators. These results
are given in Corollaries 1, 2 and 3 below. We note that the resulting
conditions are similar to those obtained by Root and Prosser for the
deterministic identification problem (Ref. 1).

Corollary 1: If $\mathcal{H}$ is composed only of causal linear time invariant
convolution operators $H$, $[Hx](t) = \int_0^t h(t - \tau)x(\tau)\,d\tau$,
$h \in L_1(0, \infty)$ and if

(i) $\tilde{\mathcal{D}} = \{h \mid h \in L_1(0, \infty),\ H \in \mathcal{D}\}$ has a compact
closure in $L_1(0, \infty)$, and

(ii) for each $\delta > 0$ there exists an $x \in L_\infty(s)$, $T > 0$ such that
$\|P_T Hx - P_T Kx\|_2 > 0$ for all $h, k \in \tilde{\mathcal{D}}$ for which
$\|h - k\|_1 = \int_0^\infty |h(t) - k(t)|\,dt \ge \delta$

then $\mathcal{D}$ is determinable.

Necessary and sufficient conditions for $\tilde{\mathcal{D}}$ to have a compact closure
are (see Ref. 6, pp. 298-299):

(i) $\tilde{\mathcal{D}}$ is a bounded subset of $L_1(0, \infty)$,

(ii) $\lim_{\tau \to 0} \int_0^\infty |h(t + \tau) - h(t)|\,dt = 0$ uniformly for
$h \in \tilde{\mathcal{D}}$, and

(iii) $\lim_{\tau \to \infty} \int_\tau^\infty |h(t)|\,dt = 0$ uniformly for
$h \in \tilde{\mathcal{D}}$.

Proof: We first show that if the closure of $\tilde{\mathcal{D}}$ is compact then the
closure of $\mathcal{D}$ is compact in the respective topologies. Let $\|H\|^*$,
$H \in \mathcal{H}$, denote the usual operator norm, that is,

$$ \|H\|^* = \sup_{x \in L_2(0, \infty)} \frac{\|Hx\|_2}{\|x\|_2}. $$

Given any $\epsilon > 0$ there exists $T^* > 0$, $x^* \in L_\infty(s)$ such that

$$ \|H\| - \epsilon < \frac{\|P_{T^*} Hx^*\|_2}{\|P_{T^*} x^*\|_2} = \frac{\|P_{T^*} H P_{T^*} x^*\|_2}{\|P_{T^*} x^*\|_2}. $$

Note however that $P_{T^*} x^* \in L_2(0, \infty)$; hence
$\|H\| \le \epsilon + \|H\|^*$ for arbitrary $\epsilon > 0$, so

$$ \|H\| \le \|H\|^*. \tag{24} $$

Using the linearity of $H$ and Hölder's inequality we see that

$$ \|H\|^* = \sup_{x \in L_2(0, \infty)} \frac{\|Hx\|_2}{\|x\|_2} \le \sup_{x \in L_2(0, \infty)} \frac{\|h\|_1 \|x\|_2}{\|x\|_2} = \|h\|_1. \tag{25} $$



NOISY CHANNEL CLASSIFICATION 3279 

Thus compactness in $L_1(0, \infty)$ implies compactness in $\mathcal{H}$ and condition
(i) of Theorem 1 is satisfied.

Given $\delta > 0$, choose $x_0$, $T_0$ so that condition (ii) of Corollary 1 is
satisfied. We have already used the fact that
$\|H P_{T_0} x_0\|_2 \le \|h\|_1 \|P_{T_0} x_0\|_2$. Hence $H P_{T_0} x_0$ is a continuous
linear mapping (that is, mapping the kernels into time functions) from
$L_1(0, \infty)$ into $L_2(0, \infty)$. Thus the image of $\tilde{\mathcal{D}}$ under this
mapping has a compact closure. We can therefore choose a number $\hat{T} > T_0$ so that

r (HP T .x - KP T °x )\t) dt<\ for all H,KtT). (26) 

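Define $x$ as follows:

$$ x(t) = \begin{cases}
x_0(t) & \text{for } 0 < t \le T_0 \\
0 & \text{for } T_0 < t \le \hat{T} \\
x_0(t - \hat{T}) & \text{for } \hat{T} < t \le \hat{T} + T_0 \\
0 & \text{for } \hat{T} + T_0 < t \le 2\hat{T} \\
\quad\vdots \\
x_0(t - n\hat{T}) & \text{for } n\hat{T} < t \le n\hat{T} + T_0 \\
0 & \text{for } n\hat{T} + T_0 < t \le (n + 1)\hat{T} \\
\quad\vdots
\end{cases} \tag{27} $$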
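
The repeated test signal of equation (27) is simple to build on a sampled grid. The following is a minimal sketch, assuming a uniform step `dt` that divides both the active length $T_0$ and the period $\hat{T}$; the function name `repeated_test_signal` and the constant pulse used in the usage lines are hypothetical.

```python
def repeated_test_signal(x0, T0, T_hat, n_periods, dt):
    """Samples of x on [0, n_periods * T_hat).

    Each period of length T_hat carries x0 on its first T0 seconds and is
    zero afterwards, so the responses to successive pulses overlap only
    through the tail of the kernel.
    """
    per = round(T_hat / dt)       # samples per period
    active = round(T0 / dt)       # samples on which x0 is applied
    one_period = [x0(k * dt) if k < active else 0.0 for k in range(per)]
    return one_period * n_periods

s = 1.5
x = repeated_test_signal(lambda t: s, T0=1.0, T_hat=4.0, n_periods=3, dt=0.1)
```

Because each pulse is a shifted copy of $x_0$, the peak constraint $|x(t)| \le s$ is inherited from $x_0$ without further checking.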

Note that $x \in L_\infty(s)$. Following the same line of reasoning as in the proof
of condition (ii) of Theorem 2 we can show that there exists an $r(\delta) > 0$
so that $\|P_{T_0}(Hx_0 - Kx_0)\|_2^2 > r(\delta)$ for all $H, K \in \mathcal{D}$ for which
$\|h - k\|_1 \ge \delta$. We now proceed to show that

$$ \|P_{n\hat{T}}(Hx - Kx)\|_2^2 > \hat{r}(\delta)\, n\hat{T} \tag{28} $$

where $\hat{r}(\delta) = r(\delta)/4\hat{T}$. Let
$y_0(t) = [H P_{T_0} x_0 - K P_{T_0} x_0](t)$ and $y_i(t) = y_0(t - i\hat{T})$.
Then, by linearity and time invariance,

f (Hx - Kxf(t) dt = f _ [y„(0 + 2/i(0 + • • • + 2/.--I (01 



and 



y)(t) dt = / 2/o(0 dt for j ^ *. 



2 dt 

(29) 
(30) 
Using these relationships we see that

/(i + nf 
(y n + • • • + ViY dt 
.f 

/t « + 1 > f pd+i)f 

yUt-2 \y t | (| 2/o | + ■•• + | »*-, |) * 

. T * iT 

ri d/[i - 2 | ? (t/S + • • • + yU) dtj 

y3 tttl-2 / yl 



...dt 

J if L t/f 

Hence 



^ r(8)/2. (31) 



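$$ \|P_{n\hat{T}}(Hx - Kx)\|_2^2 = \int_0^{\hat{T}} y_0^2\,dt + \int_{\hat{T}}^{2\hat{T}} (y_0 + y_1)^2\,dt + \cdots + \int_{(n-1)\hat{T}}^{n\hat{T}} (y_0 + \cdots + y_{n-1})^2\,dt $$

$$ \ge n\,r(\delta)/4 = \hat{r}(\delta)\, n\hat{T}. \tag{32} $$

We see that this relation implies that condition (ii) of Theorem 1 is
satisfied; thus $\mathcal{D}$ is determinable.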

When $\mathcal{H}$ is composed only of causal linear time invariant convolution
operators we can also strengthen the conclusion of Theorem 2. This
result is given in the following corollary.

Corollary 2: If $\mathcal{H}$ is composed only of causal linear time invariant
convolution operators and if $\mathcal{D}$ is a determinable subset of $\mathcal{H}$ then

(i) given any $\delta > 0$ there exists an unbounded sequence $T_i$, a sequence
of inputs $x_i \in L_\infty(s)$ and a positive number $r(\delta)$ such that
$\|P_{T_i}(Hx_i - Kx_i)\|_2^2 > r(\delta)\,T_i$ for all pairs $H, K \in \mathcal{D}$ for
which $\|H - K\| > \delta$.

Proof: As a consequence of Theorem 2 we know that for any $\delta > 0$
there exists an $\hat{x} \in L_\infty(s)$, $\hat{T} > 0$ and a positive number
$r(\delta)$ such that $\|P_{\hat{T}}(H\hat{x} - K\hat{x})\|_2^2 > r(\delta)$ for all
$H, K \in \mathcal{D}$ satisfying $\|H - K\| > \delta$.

Obviously, $\|H P_{\hat{T}} \hat{x}\|_2 \le \|H\|\, \|P_{\hat{T}} \hat{x}\|_2$. Hence
$H P_{\hat{T}} \hat{x}$ is a continuous linear mapping from $\mathcal{H}$ into
$L_2(0, \infty)$. Thus the image of $\mathcal{D}$ under this mapping has a compact
closure. We can therefore choose a positive number $T > \hat{T}$ so that

[ {EPfX - HP f x)\t) dt <\ for all H,KzT>. (33) 
Proceeding as in the proof of Corollary 1 we can easily establish (i) of
Corollary 2.

Corollary 3: If $\mathcal{H}$ is composed only of causal Hilbert-Schmidt operators $H$,
$[Hx](t) = \int_0^t h(t, \tau)x(\tau)\,d\tau$,
$\int_0^\infty \int_0^\infty |h(t, \tau)|^2\,dt\,d\tau < \infty$, $h(t, \tau) = 0$ for
$\tau > t$ and if

(i) $\tilde{\mathcal{D}} = \{h \mid H \in \mathcal{D}\}$ has compact closure in the
Hilbert-Schmidt metric
($\|h - k\|_2^2 = \int_0^\infty \int_0^\infty |h(t, \tau) - k(t, \tau)|^2\,dt\,d\tau$)

(ii) for each $\delta > 0$ there exists an unbounded sequence $T_i$, a sequence
of $x_i \in L_\infty(s)$ and a positive constant $r(\delta)$ so that
$\|P_{T_i}(Hx_i - Kx_i)\|_2^2 > r(\delta)\,T_i$ for all $h, k \in \tilde{\mathcal{D}}$ for
which $\|h - k\|_2 > \delta$.

Then $\mathcal{D}$ is determinable.

Proof: As in the proof of Corollary 1 we can show that $\|H\| \le \|H\|^*$
where

$$ \|H\|^* = \sup_{x \in L_2(0, \infty)} \frac{\|Hx\|_2}{\|x\|_2}. $$

From the Schwarz inequality we see that

$$ \|Hx\|_2^2 = \int_0^\infty \Big(\int_0^t h(t, \tau)x(\tau)\,d\tau\Big)^2 dt = \int_0^\infty \Big(\int_0^\infty h(t, \tau)x(\tau)\,d\tau\Big)^2 dt \le \|h\|_2^2\, \|x\|_2^2, \tag{34} $$

which implies that

$$ \|H\| \le \|h\|_2. \tag{35} $$

Hence, compactness of the closure of $\tilde{\mathcal{D}}$ implies that the closure of
$\mathcal{D}$ is compact, and conditions (i) and (ii) of Theorem 1 are easily verified
to hold.

VI. COLORED NOISE 

Theorems 1 and 2 were derived for the case when $z(t)$, the additive
noise, was a zero mean white stochastic process. The situation when
$E\,z(t)z(\tau) = R(t, \tau)$ can be handled in a similar fashion. The only
additional assumptions are:

(i) $R(t, \tau)$ is positive definite; that is,

$$ \int_0^\infty \int_0^\infty R(t, \tau)w(t)w(\tau)\,dt\,d\tau > 0 \quad \text{for all } w \in L_2(0, \infty) $$

satisfying $\int_0^\infty |w(t)|^2\,dt > 0$, and either

(ii) $R(t, \tau)$ is Hilbert-Schmidt; that is,

$$ \int_0^\infty \int_0^\infty |R(t, \tau)|^2\,dt\,d\tau = C^2 < \infty, \quad \text{or} $$

(iii) if $R(t, \tau) = R_0(t - \tau)$ then

$$ \int |R_0(t)|^2\,dt = C_0^2 < \infty. $$

Inspecting the proof of Theorem 1, one sees that the whiteness assumption
was only used in equations (8) and (17). If $E\,z(t)z(\tau) = R(t, \tau)$,
then equation (8) becomes

$$ \sigma_{kj}^2 = \text{Var}\,[q(w, k) - q(w, j)] = 4 \int_0^{T_i} \int_0^{T_i} R(t, \tau)(H_k x_i - H_j x_i)(t)(H_k x_i - H_j x_i)(\tau)\,dt\,d\tau. \tag{36} $$

Since $H_k$ and $H_j$ were chosen so that $\|P_{T_i}(H_k x_i - H_j x_i)\|_2 > 0$, we see
that since $R(t, \tau)$ is positive definite, $\sigma_{kj}^2 > 0$. If we choose
$\theta_i$ so that $4\theta_i^2$ is less than $\min_{j \ne k} \sigma_{jk}^2$, rather than
choosing $\theta_i$ less than $\|P_{T_i}(H_k x_i - H_j x_i)\|_2$, inequality (9) will
remain true.

Equation (17) is changed as follows. If $E\,z(t)z(\tau) = R(t, \tau)$, then by
the Schwarz inequality

$$ \sigma_{j\ell}^2 = 4 \int_0^{T_i} \int_0^{T_i} R(t, \tau)(H_j x_i - H_\ell x_i)(t)(H_j x_i - H_\ell x_i)(\tau)\,dt\,d\tau $$

$$ \le 4 \int_0^{T_i} |(H_j x_i - H_\ell x_i)(\tau)| \Big[\int_0^{T_i} |R(t, \tau)|^2\,dt\Big]^{1/2} \Big[\int_0^{T_i} (H_j x_i - H_\ell x_i)^2(t)\,dt\Big]^{1/2} d\tau $$

$$ \le 4C R^2 s^2 T_i. \tag{37} $$

On the other hand, if $E\,z(t)z(\tau) = R_0(t - \tau)$, equation (17) is changed
as follows.

$$ \sigma_{j\ell}^2 = 4 \int_0^{T_i} \int_0^{T_i} R_0(t - \tau)(H_j x_i - H_\ell x_i)(t)(H_j x_i - H_\ell x_i)(\tau)\,dt\,d\tau $$

$$ \le 4 \int_0^{T_i} |(H_j x_i - H_\ell x_i)(t)| \Big[\int_0^{T_i} |R_0(t - \tau)|^2\,d\tau\Big]^{1/2} \Big[\int_0^{T_i} (H_j x_i - H_\ell x_i)^2(\tau)\,d\tau\Big]^{1/2} dt $$

$$ \le 4C_0 R s\,(T_i)^{1/2} \Big(\int_0^{T_i} dt\Big)^{1/2} \Big(\int_0^{T_i} (H_j x_i - H_\ell x_i)^2(t)\,dt\Big)^{1/2} \le 4C_0 R^2 s^2\, T_i^{3/2}. \tag{38} $$

From equation (37) we see that the limit of integration in equation (19)
now becomes $3rT_i^{1/2}/16RsC^{1/2}$. If we use equation (38), this limit becomes
$3rT_i^{1/4}/16RsC_0^{1/2}$. In either case this limit diverges as $i$ increases. Thus
Theorem 1 is still correct if the noise is colored. One can also see that
Theorem 2 is true without any modifications. The whiteness assumption
does not enter into the proof in any substantial manner.
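
The variance in equation (36) and the Schwarz-type bound behind equation (37) can be checked on a grid. This is a minimal sketch, assuming an exponential covariance $R(t, \tau) = e^{-|t - \tau|}$ and a sinusoidal output difference, neither of which comes from the paper; the constant `C` below is the Hilbert-Schmidt norm of $R$ restricted to the finite square $[0, T]^2$, and the function names are hypothetical.

```python
import math

def colored_variance(R, d, dt):
    """4 * double integral of R(t, tau) d(t) d(tau), as in equation (36)."""
    n = len(d)
    return 4.0 * dt * dt * sum(R(i * dt, j * dt) * d[i] * d[j]
                               for i in range(n) for j in range(n))

def hs_constant(R, n, dt):
    """Square root of the double integral of |R|^2 over [0, n*dt]^2."""
    return math.sqrt(dt * dt * sum(R(i * dt, j * dt) ** 2
                                   for i in range(n) for j in range(n)))

def R(t, tau):
    """Assumed positive definite covariance (exponential correlation)."""
    return math.exp(-abs(t - tau))

dt, n = 0.02, 100                                  # interval [0, 2]
d = [math.sin(3.0 * k * dt) for k in range(n)]     # hypothetical H_k x - H_j x

var = colored_variance(R, d, dt)
C = hs_constant(R, n, dt)
energy = dt * sum(v * v for v in d)                # squared L2 norm of d
```

With a positive definite covariance the variance stays strictly positive, and the Schwarz inequality gives `var <= 4 * C * energy`, which is the step that replaces equation (17) in the colored noise case.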

VII. CONCLUSIONS 

In this paper we have attempted to formalize the notion of identifica- 
tion and examined conditions under which the a priori information would 
guarantee that the problem of identification was well formulated. Our 
purpose has been to indicate when identification was possible and not 
to specify a given identification procedure. It is hoped that the condi- 
tions derived here may motivate researchers to consider larger classes of 
identification problems than have hitherto been examined and also to 
indicate for what classes of problems identification is not possible. 

VIII. ACKNOWLEDGMENTS 

The author wishes to thank J. M. Holtzman and P. P. Varaiya for 
their helpful discussions and criticisms. He also wishes to acknowledge 
support for the initial phases of this work from NASA under Grant 
NsG-354, Supplement 4 while at the University of California at Berkeley. 

APPENDIX 

Proof that the Space $\mathcal{H}$ Is Complete

In this appendix we show that the space $\mathcal{H}$ with the metric induced
by its norm is a complete space. If $\{H_n\}$ is a Cauchy sequence in $\mathcal{H}$, we
show that there exists an element $H \in \mathcal{H}$ such that
$\lim_{n \to \infty} \|H - H_n\| = 0$.

Let $\{H_n\}$ be a Cauchy sequence in $\mathcal{H}$. Then given any $\epsilon > 0$ there
exists a number $N(\epsilon)$ such that for all $m, n > N(\epsilon)$,
$\|H_n - H_m\| < \epsilon$. From the definition of the metric,

$$ \|H_n - H_m\| \ge \frac{\|P_T H_n x - P_T H_m x\|_2}{\|P_T x\|_2} \tag{39} $$

for all $T > 0$, $x \in L_\infty(s)$, $\|P_T x\|_2 \ne 0$. Using the definition of
$L_\infty(s)$,

$$ \epsilon\, s\, T^{1/2} \ge \epsilon\, \|P_T x\|_2 > \|P_T(H_n x - H_m x)\|_2 \tag{40} $$

for all $n, m > N(\epsilon)$, $T > 0$, $x \in L_\infty(s)$, $\|P_T x\|_2 \ne 0$. Thus, for each
$T > 0$, $x \in L_\infty(s)$, $\|P_T x\|_2 \ne 0$, $\{H_n x\}$ is a sequence of functions
in $L_{2e}$ and for each $T > 0$, $P_T H_n x$ is a Cauchy sequence in $L_2[0, T]$. Hence,
for each $T$ there exists at least one time function $y_T \in L_{2e}$, such that
$P_T y_T \in L_2(0, \infty)$ and $\lim_{n \to \infty} \|P_T H_n x - P_T y_T\|_2 = 0$.
Furthermore, $y_T$ is uniquely (except for a set of measure zero) specified over
$[0, T]$. Because of this uniqueness, if $T_1 < T_2$, then
$P_{T_1} y_{T_1} = P_{T_1} y_{T_2}$. Hence there exists a unique function
$y \in L_{2e}$ such that $P_T y = P_T y_T$ for each $T > 0$. This function can be
constructed:

$$ y(t) = \begin{cases}
y_1(t) & \text{for } 0 \le t < 1 \\
y_2(t) & \text{for } 1 \le t < 2 \\
\quad\vdots \\
y_n(t) & \text{for } n - 1 \le t < n \\
\quad\vdots
\end{cases} \tag{41} $$

For each $x \in L_\infty(s)$, $x \ne 0$ we have uniquely specified a function
$y \in L_{2e}$. For $x = 0$ we arbitrarily put $y = 0$. Call the operator defined by this
association $H$; that is, $Hx = y$. We now show that
$\lim_{n \to \infty} \|H - H_n\| = 0$. For each $T > 0$, $x \in L_\infty(s)$,
$\|P_T x\|_2 \ne 0$ we can use the triangle inequality to show that

$$ \frac{\|P_T(Hx - H_n x)\|_2}{\|P_T x\|_2} \le \frac{\|P_T Hx - P_T H_m x\|_2}{\|P_T x\|_2} + \frac{\|P_T(H_n x - H_m x)\|_2}{\|P_T x\|_2}. \tag{42} $$

If $H_n$, $H_m$ are members of the Cauchy sequence, from our previous
development we know that there exists a number $N(\epsilon/2)$ independent
of $x$ and $T$ such that

$$ \frac{\|P_T(H_n x - H_m x)\|_2}{\|P_T x\|_2} < \epsilon/2 \quad \text{for } m, n > N(\epsilon/2). \tag{43} $$

Since $\lim_{m \to \infty} \|P_T(Hx - H_m x)\|_2 = 0$ we can find another number
$N^*(\epsilon/2, x, T) > N(\epsilon/2)$ such that

$$ \frac{\|P_T(Hx - H_m x)\|_2}{\|P_T x\|_2} < \epsilon/2 \quad \text{for } m > N^*(\epsilon/2, x, T). \tag{44} $$

Hence for all $T > 0$, $P_T x \ne 0$

$$ \frac{\|P_T(Hx - H_n x)\|_2}{\|P_T x\|_2} < \epsilon \quad \text{for } n > N(\epsilon/2), \tag{45} $$

and if $H$ were causal it follows that $H \in \mathcal{H}$ with
$\lim \|H - H_n\| = 0$. The causality of $H$ is easily established. For each
$x \in L_\infty(s)$, $T > 0$:

$$ \|P_T Hx - P_T H P_T x\|_2 \le \|P_T Hx - P_T H_n x\|_2 + \|P_T H P_T x - P_T H_n x\|_2 \tag{46} $$

$$ = \|P_T(Hx - H_n x)\|_2 + \|P_T(H P_T x - H_n P_T x)\|_2. \tag{47} $$

For $n$ sufficiently large each term on the right side may be arbitrarily
small, hence $\|P_T Hx - P_T H P_T x\|_2 = 0$ for all $x \in L_\infty(s)$, $T > 0$.

If $\mathcal{H}$ is composed only of linear operators the completeness proof
follows as above except to additionally observe that $H$ is linear.